Image processing

Optimization method of PSO-PID control for interferometric closed-loop fiber optic gyroscope
Liu Shangbo, Dan Zesheng, Lian Baowang, Xu Jintao, Cao Hui
2024, 53(3): 20230626. doi: 10.3788/IRLA20230626
Objective  The interferometric closed-loop fiber optic gyroscope (ICFOG) has been widely used in military and civil fields such as aerospace, defense equipment, navigation and surveying, and vehicle inertial navigation systems. These applications are developing toward light weight, low power consumption, long life, high reliability, freedom from lock-in, and mass production. A PSO-PID controller can improve the dynamic response of the fiber optic gyroscope and effectively track its angular rate input. Because the fiber optic gyroscope is based on the Sagnac effect in a closed optical path, its bandwidth is much larger than that of traditional gyroscopes. In a digital closed-loop fiber optic gyroscope the optical path responds very quickly, so the system bandwidth is mainly determined by the detection circuit, and choosing a suitable digital controller helps to improve the dynamic performance of the gyroscope.
Methods  The system block diagram of the fiber optic gyroscope (Fig.1) is established, the ICFOG closed loop is reduced to an equivalent mathematical model (Fig.2) by analyzing the working principle of the gyroscope, and the closed-loop discrete control system is then derived. On this basis, a new PSO-PID compound controller is designed (Fig.3), and the steps of the standard PSO optimization of the PID controller are analyzed (Fig.4). The controller can adjust the parameters $K_p$, $K_i$ and $K_d$ online during operation (Fig.15). By comparison with the BP-neural-network PID tuning method (Fig.5), the fuzzy PID tuning method (Fig.6) and the conventional PID control method, the advantages of PSO-PID control are illustrated in terms of the angular rate input tracking speed (Fig.12) and tracking error (Fig.13) of the fiber optic gyro.
Results and Discussions  With the PSO-PID control method the fitness value changes rapidly, reaching the optimal solution of 21.8925 after 15 iterations. The tracking time of the FOG angular rate input is 1.2 s; compared with the BP-PID, PID and F-PID control methods, the tracking speed is increased by factors of 1.91, 3.5 and 1.75, respectively. With the PSO-PID control method the tracking error is $4.7 \times 10^4$ m, smaller than that of the other control methods; compared with the F-PID, BP-PID and PID control methods, the control accuracy is improved by 45.27%, 46.03% and 66.30%, respectively. The comparison of the dynamic performance of the different control methods (Tab.1) shows that the PSO-PID controller achieves the control goal quickly with a small tracking error.
Conclusions  Based on the mathematical model of the fiber optic gyro, this paper puts forward an optimization scheme for its digital controller. The traditional digital controller is improved, and a PSO-PID controller is proposed and simulated. Comparison with several control methods shows that the PSO-PID controller shortens the adjustment time and reduces overshoot, effectively improving the dynamic performance of the fiber optic gyroscope while ensuring stability, which is of engineering significance and practical value. To apply this optimization scheme in engineering practice, more external factors and a more detailed analysis of the control parameters need to be considered, which will be the focus of future research.
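As a rough illustration of the tuning loop described above, the sketch below shows how a standard PSO can search the $K_p$, $K_i$, $K_d$ gains of a PID controller. The discrete first-order plant, the ITAE-style fitness, and all hyperparameters are illustrative assumptions, not the paper's actual gyroscope model:

```python
import numpy as np

def pid_fitness(gains, setpoint=1.0, steps=400, dt=0.01):
    """Simulate a simple discrete plant under PID control and
    return an ITAE-style cost (lower is better)."""
    kp, ki, kd = gains
    y, integ, prev_err, cost = 0.0, 0.0, 0.0, 0.0
    for k in range(steps):
        err = setpoint - y
        integ += err * dt
        deriv = (err - prev_err) / dt
        u = kp * err + ki * integ + kd * deriv
        y += dt * (-y + u)            # placeholder first-order plant
        cost += (k * dt) * abs(err)   # time-weighted absolute error
        prev_err = err
    return cost

def pso_pid(n_particles=20, n_iter=50, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Standard PSO over the 3-D gain space (Kp, Ki, Kd)."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(0.0, 10.0, (n_particles, 3))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_val = np.array([pid_fitness(p) for p in pos])
    gbest = pbest[pbest_val.argmin()].copy()
    for _ in range(n_iter):
        r1, r2 = rng.random((2, n_particles, 3))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, 0.0, 10.0)
        vals = np.array([pid_fitness(p) for p in pos])
        better = vals < pbest_val
        pbest[better], pbest_val[better] = pos[better], vals[better]
        gbest = pbest[pbest_val.argmin()].copy()
    return gbest, pbest_val.min()

if __name__ == "__main__":
    gains, cost = pso_pid()
    print("best (Kp, Ki, Kd):", gains, "fitness:", cost)
```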
Image highlight removal method based on parallel multi-axis self-attention
Li Pengyue, Xu Xinying, Tang Yandong, Zhang Zhaoxia, Han Xiaoxia, Yue Haifeng
2024, 53(3): 20230538. doi: 10.3788/IRLA20230538
Objective  Highlights appear as bright spots on the surface of glossy materials under illumination and can obscure background information to different degrees. The ambiguity of the image highlight-layer model and the large dynamic range of highlights make highlight removal a challenging visual task. Purely local methods tend to produce artifacts in the highlight areas of the image, while purely global methods tend to produce color distortion in highlight-free areas. To address the imbalance of local and global features in image highlight removal and the ambiguity of highlight-layer modeling, we propose a threshold-fusion U-shaped deep network based on a parallel multi-axis self-attention mechanism for image highlight removal.
Methods  Our method avoids the ambiguity of highlight-layer modeling through implicit modeling. It uses a U-shaped network to combine contextual information with low-level information to estimate the highlight-free image, and introduces a threshold fusion structure between the encoder and decoder to further enhance the feature representation capability of the network. The U-shaped network uses a contracting convolution strategy to extract contextual semantic information quickly, gradually recovers the low-level information of the image in the expansion path, and connects the features of each stage of the contraction path to the corresponding stages of the expansion path. The threshold mechanism between the encoder and decoder adjusts the information flow in each encoder channel, allowing the encoder to extract highlight-related features as fully as possible at the channel level. The threshold structure first performs high/low-frequency decoupling and feature extraction on the input features, then fuses the two types of features by pixel-wise multiplication, and finally learns complementary low-level features in a residual pattern. In addition, the parallel multi-axis self-attention mechanism is used as the unit structure of the U-shaped network to balance the learning of local and global features, which eliminates the distortion and artifacts in the recovered highlight-free images caused by imbalanced extraction of local and global features. The local self-attention computes interactions within a small P×P window; after the correlation calculation, the window image is mapped back to an output of the same dimension as the input by the inverse of the window-partition operation. Similarly, the global self-attention divides the input features into G×G grids with larger receptive fields; each grid cell, whose window size adapts to the feature size, is a unit for computing correlation, and the larger receptive field facilitates the extraction of global semantic information. For the loss function, the squared loss and the mean absolute error loss are widely used in image restoration; the squared penalty magnifies the difference between large and small errors and usually results in overly smooth restored images, so the mean absolute error loss is used to train our network.
Results and Discussions  Qualitative experiments on real highlight images show that our method removes highlights more effectively, while the compared methods usually cannot remove highlights accurately and efficiently and are prone to produce artifacts and distortion in highlight-free areas. Quantitative experiments on real-world highlight image datasets show that our method outperforms five other typical image highlight removal methods in both PSNR and SSIM. The PSNR values exceed those of the second-best method by 4.10 dB, 7.09 dB, and 6.58 dB on the SD1, RD, and SHIQ datasets, respectively, and the SSIM values exceed those of the second-best method by 4%, 9%, and 3% on the three datasets. Ablation studies of the network structure verify the effectiveness of the threshold fusion module and the parallel multi-axis self-attention module: the threshold fusion module increases PSNR by 0.68 dB and SSIM by 1%, and the multi-axis self-attention module increases average PSNR by 0.55 dB and SSIM by 1%. The visual results of each ablation model also show that, as the network structure is gradually optimized, highlight removal improves visually. The outputs of the purely convolutional models M1 and M2 retain more residual highlights and produce distortion in highlight-free areas, while models M3, M4 and M5, which combine CNN with the self-attention module, achieve visually better results.
Conclusions  The experimental results show that our method achieves good visual results for highlight removal on both public natural and text image datasets, and outperforms other methods in terms of quantitative evaluation metrics.
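As a side note, a minimal sketch of the local-window versus global-grid token partitioning that underlies parallel multi-axis self-attention is given below; the tensor layout and the P = G = 8 values are illustrative assumptions, and the attention computation itself is omitted:

```python
import torch

def window_partition(x, p):
    """Split (B, H, W, C) features into non-overlapping p x p windows:
    local self-attention is computed inside each window."""
    B, H, W, C = x.shape
    x = x.view(B, H // p, p, W // p, p, C)
    return x.permute(0, 1, 3, 2, 4, 5).reshape(-1, p * p, C)

def grid_partition(x, g):
    """Split (B, H, W, C) features into a g x g grid: each of the g*g
    cells gathers pixels strided across the whole image, so attention
    within a cell has a global receptive field."""
    B, H, W, C = x.shape
    x = x.view(B, g, H // g, g, W // g, C)
    return x.permute(0, 2, 4, 1, 3, 5).reshape(-1, g * g, C)

x = torch.randn(1, 32, 32, 64)
local_tokens = window_partition(x, p=8)    # (16, 64, 64)
global_tokens = grid_partition(x, g=8)     # (16, 64, 64)
print(local_tokens.shape, global_tokens.shape)
```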
Fast calculation of radiative heat transfer coefficient between diffuse and non-diffuse surfaces
Li Fubing, You Qi, Leng Junmin, Yang Linhao
2024, 53(3): 20230611. doi: 10.3788/IRLA20230611
Objective  Radiative heat transfer is one of the three basic modes of heat transfer and has an important impact on the study of the temperature distribution and infrared radiation characteristics of the outer surface of a space target. For the radiative heat transfer of a system of incompletely gray-diffuse surfaces (a model containing both diffuse and non-diffuse surfaces), analytical solutions are usually unavailable. The Monte-Carlo method offers good accuracy but requires long computation times. To solve this problem, this paper proposes a representation of the reflected ray energy of a diffuse surface and, based on the diffuse reflection characteristics of such surfaces, a calculation method for the radiative transfer coefficients of an incompletely diffuse surface system. This avoids repeated ray tracing before the ray energy threshold is reached and improves the calculation speed.
Methods  A method for expressing the reflected energy of diffuse surfaces is proposed. When ray tracing with the Monte-Carlo method, if a ray hits a diffuse surface, the reflected energy of the ray is represented as the set of diffusely emitted beams in the upper half-space of the surface, using the diffuse reflection characteristics of the surface (Fig.2(b)). The Monte-Carlo tracing process is modified accordingly. First, the radiative transfer coefficients of the diffuse surfaces are calculated with the Monte-Carlo method. Then, when calculating the radiative transfer coefficients of the non-diffuse surfaces, if a ray emitted from a surface intersects a diffuse surface, the precomputed transfer coefficients of that diffuse surface are multiplied by the reflected energy to obtain the energy absorbed by the other surfaces, and the ray trace is terminated. This avoids many subsequent tracing steps and improves the computational efficiency of the whole system (Fig.5).
Results and Discussions  Using the cube model, with surfaces No.1 and No.2 diffuse and the remaining surfaces specular (Fig.4(a)), the radiative transfer coefficients from each surface to the other surfaces (including itself) are calculated with the Monte-Carlo method and with the fast algorithm proposed in this paper (Tab.1-2). The two methods have the same accuracy, but the new method is more efficient because it greatly reduces the number of ray-tracing operations in the Monte-Carlo procedure (Fig.6). A 13-facet L-shaped unenclosed cavity model further shows that the method is also applicable to complex models. Finally, taking the cube model as an example, the advantage of the fast method over the Monte-Carlo method is analyzed theoretically: for beams emitted from a non-diffuse surface, the average number of traces per ray is much smaller than in the Monte-Carlo method, and the higher the reflectivity of the surface element, the more significant the computational advantage. For example, with an energy threshold of 0.001 and two diffuse reflective surfaces in the model, when the surface reflectance is 0.4 the calculation time for the non-diffuse surfaces with the fast method is 0.307 times that of the Monte-Carlo method; when the reflectance increases to 0.8, it is only 0.081 times that of the Monte-Carlo method, an improvement of more than a factor of ten.
Conclusions  To address the long computation time of the traditional Monte-Carlo method in calculating the radiative transfer coefficients between diffuse and non-diffuse surfaces, a fast calculation method is proposed. The principle of the method and its difference from the Monte-Carlo method are introduced first; the cube model and the L-shaped unenclosed cavity model are then used to compare the results and computation times of the fast method and the Monte-Carlo method, illustrating the efficiency advantage of the fast method; the factors affecting this computational advantage are examined through theoretical analysis; and finally an outlook is given on how to further reduce the calculation time for the diffuse surface elements.
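The early-termination idea can be sketched as follows; the geometry is replaced by a random stand-in hit test, and the precomputed diffuse coefficients, surface count, reflectivity and threshold are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
N_SURF = 6
RHO = 0.8            # specular reflectivity of non-diffuse surfaces
E_THRESH = 1e-3      # energy threshold that normally ends a trace

# Step 1 (assumed precomputed): transfer coefficients of the diffuse
# surfaces, obtained by a standard Monte-Carlo run; each row sums to 1.
RD_DIFFUSE = rng.dirichlet(np.ones(N_SURF), size=N_SURF)

def trace_ray_fast(start):
    """Trace one ray from a non-diffuse surface. On hitting a diffuse
    surface, deposit the ray's remaining energy through the precomputed
    diffuse coefficients and stop, instead of tracing further bounces."""
    absorbed = np.zeros(N_SURF)
    energy, surf = 1.0, start
    while energy > E_THRESH:
        surf = rng.integers(N_SURF)      # stand-in for a geometric hit test
        if surf < 2:                     # surfaces 0 and 1 assumed diffuse
            absorbed += energy * RD_DIFFUSE[surf]   # terminate early
            return absorbed
        absorbed[surf] += energy * (1 - RHO)        # partial absorption
        energy *= RHO                               # keep tracing reflection
    absorbed[surf] += energy
    return absorbed

coeffs = sum(trace_ray_fast(2) for _ in range(20000)) / 20000
print("transfer coefficients from surface 2:", coeffs.round(4))
```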
Survey of research methods in infrared image dehazing
Tang Wenjuan, Dai Qun
2024, 53(2): 20230416. doi: 10.3788/IRLA20230416
Significance  Infrared image dehazing refers to restoring the contrast and visual quality of infrared images by removing the influence of haze, smoke and other atmospheric media. Infrared imaging is widely used in military, security, medical, energy exploration and other fields owing to its all-day operation without illumination constraints. Enhanced image visibility: infrared images captured in hazy or foggy conditions often suffer from reduced visibility and degraded quality, and dehazing improves their visibility for better interpretation and analysis. Improved object detection and recognition: removing haze reveals important visual features of objects more clearly, leading to more accurate and reliable results in applications such as surveillance, target tracking, and autonomous vehicles. Enhanced environmental monitoring: infrared imaging is widely used in forest fire detection, air pollution monitoring, and thermal inspection of infrastructure, and dehazing improves the accuracy and reliability of these monitoring systems by providing clearer, more detailed images. Enhanced human perception: dehazed infrared images give human observers clearer and more understandable visual information, which is particularly important where operators rely on infrared imagery for decision-making, such as search and rescue, firefighting, and security surveillance. Advancement of computer vision research: dehazing infrared images is a challenging problem whose solution requires novel techniques such as image enhancement, deconvolution, and scene understanding, and research in this area benefits computer vision as a whole.
Progress  In recent years, with the continuous development of computer vision and deep learning, significant progress has been made in infrared image dehazing, supporting the development of infrared imaging applications. According to the type of data used, existing methods can be divided into two categories: multi-information fusion and single-frame image processing. Image dehazing is highly challenging because the degradation level of an image depends on factors such as the concentration of suspended particles and the distance between the target and the detector, which are difficult to obtain directly from the image. Researchers have proposed multi-information fusion algorithms that assist the restoration of infrared images by fusing additional information acquired from multiple sensors or multiple images; these mainly include polarization-based dehazing (Fig.2) and fusion-weighted dehazing methods. Single-frame image processing applies digital image processing to individual static images, and in practice is often combined with machine learning and deep learning to achieve better results. This article mainly discusses image enhancement and image reconstruction in single-frame processing. For image enhancement, the MSR algorithm (Fig.5) is combined with the CLAHE algorithm to enhance foggy images (Fig.3, Fig.4). Image reconstruction applied to infrared dehazing estimates unknown information from the characteristics of known information to restore image quality degraded by haze; the main methods include the dark channel prior, superpixels with MRF (Fig.7), atmospheric-light-estimation-based methods (Fig.8), color-attenuation-prior-based methods (Fig.9), detail-transmission-prior-based methods, and gradient-channel-prior-based dehazing algorithms. Overall, both multi-modal fusion and single-frame processing contribute to the advancement of infrared image dehazing by leveraging different types of data and image processing algorithms.
Conclusions and Prospects  Infrared image dehazing will become more intelligent: researchers increasingly use deep learning and convolutional neural network (CNN) techniques to automate haze removal. In the future, infrared image dehazing is expected to be deeply integrated with other image processing techniques. Multi-modal fusion extracts the most useful information from multiple data sources to improve the understanding and processing of image data and to enhance image quality and processing efficiency; to improve the accuracy of infrared image dehazing, incorporating visible-light or depth images can be beneficial.
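For reference, a minimal sketch of the dark channel prior method cited above is shown below; the patch size, ω and t0 are conventional values from the DCP literature, not from this survey:

```python
import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel(img, patch=15):
    """Per-pixel minimum over color channels, then a local minimum
    filter: hazy regions have high dark-channel values."""
    return minimum_filter(img.min(axis=2), size=patch)

def dehaze_dcp(img, patch=15, omega=0.95, t0=0.1):
    """Classic dark-channel-prior restoration of I = J*t + A*(1 - t)."""
    dc = dark_channel(img, patch)
    # Atmospheric light A: mean color of the brightest 0.1% dark-channel pixels
    flat = dc.ravel()
    idx = np.argsort(flat)[-max(1, flat.size // 1000):]
    A = img.reshape(-1, 3)[idx].mean(axis=0)
    t = 1.0 - omega * dark_channel(img / A, patch)   # transmission estimate
    t = np.clip(t, t0, 1.0)[..., None]
    return np.clip((img - A) / t + A, 0.0, 1.0)

hazy = np.random.rand(120, 160, 3)     # stand-in for a normalized hazy image
print(dehaze_dcp(hazy).shape)
```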
Infrared image super-resolution based on spatially variant blur kernel calibration
Cao Junfeng, Ding Qinghai, Luo Haibo
2024, 53(2): 20230252. doi: 10.3788/IRLA20230252
Objective  In recent years, infrared imaging systems have been increasingly used in industry, security, and remote sensing. However, the resolution of infrared devices is still limited by cost and manufacturing technology. To increase image resolution, deep learning-based single image super-resolution (SISR) has attracted much interest and made significant progress on simulated images. When applied to real-world images, however, most approaches suffer a performance drop such as over-sharpening or over-smoothing. The main reason is that these methods assume the blur kernel is spatially invariant across the whole image, an assumption rarely valid for infrared images, whose blur kernels are usually spatially variant due to factors such as lens aberrations and thermal defocus. To address this issue, a blur kernel calibration method is proposed to estimate spatially variant blur kernels, and a patch-based super-resolution (SR) algorithm is designed to reconstruct super-resolution images.
Methods  A collimator (parallel light tube) and a motorized rotary platform are used to establish the target image acquisition environment, and images of a multi-circle target at different positions are gathered (Fig.1). Based on sub-pixel circle center detection, the camera pose parameters are solved, and high-resolution target images are synthesized from these parameters. High- and low-resolution target image pairs are then fed into the blur kernel estimation network to obtain accurate blur kernels (Fig.3). In addition, a patch-based super-resolution algorithm is designed, which decomposes the test image into overlapping patches, reconstructs each patch separately using the estimated kernels, and merges them with weights based on Euclidean distance (Fig.4).
Results and Discussions  The experimental results show that the blur caused by the optical system is not negligible and varies slowly with spatial position (Fig.6). The proposed method, which calibrates blur kernels in a laboratory setting, obtains more accurate blur kernel estimates. As a consequence, the proposed patch-based super-resolution algorithm produces more visually pleasing results with more reliable details (Fig.7-8), and also improves no-reference quality metrics such as the natural image quality evaluator (NIQE), the perception-based image quality evaluator (PIQE), and the blind/referenceless image spatial quality evaluator (BRISQUE) (Tab.1). SR experiments on 4-bar targets with different spatial frequencies show that the proposed method can resolve a target with a spatial frequency of 3.57 cycles/mrad, while the comparison methods can only resolve 3.05 cycles/mrad under the same conditions (Fig.9).
Conclusions  A blur kernel calibration method is proposed to estimate spatially variant blur kernels, and a patch-based super-resolution algorithm is designed for super-resolution reconstruction. The experiments show that image blur caused by the optical system changes slowly with spatial position; as a result, one blur kernel can be estimated per image patch instead of densely per pixel, reducing both the complexity of calibration and the memory consumption during reconstruction. Thanks to the accurate blur kernel estimation, the proposed super-resolution algorithm outperforms the comparison methods both qualitatively and quantitatively. Furthermore, the calibration method is easy to implement in engineering applications: for any infrared camera, only dozens of multi-circle target images covering all areas of the focal plane are needed to complete the calibration. When real-time performance is required, the calibration method can also be combined with lightweight non-blind super-resolution methods. In the future, image blur caused by thermal defocus will be studied to expand the scope of the method.
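A schematic sketch of the patch-wise reconstruct-and-merge step follows; the super_resolve placeholder, the Gaussian center weighting, and the patch/stride sizes are assumptions for illustration, not the paper's network:

```python
import numpy as np

def super_resolve(patch, kernel):
    """Placeholder for non-blind SR of one patch using its calibrated
    blur kernel (a real system would run the SR network here)."""
    return patch.astype(np.float64)

def patchwise_sr(img, kernel_at, patch=64, stride=48):
    """Split the image into overlapping patches, reconstruct each with
    the blur kernel calibrated for its position, and blend the results
    with weights that fall off with distance from the patch center."""
    H, W = img.shape
    out = np.zeros((H, W))
    acc = np.zeros((H, W))
    yy, xx = np.mgrid[0:patch, 0:patch]
    c = (patch - 1) / 2.0
    w = np.exp(-((yy - c) ** 2 + (xx - c) ** 2) / (2 * (patch / 4.0) ** 2))
    for y in range(0, H - patch + 1, stride):
        for x in range(0, W - patch + 1, stride):
            k = kernel_at(y + patch // 2, x + patch // 2)
            sr = super_resolve(img[y:y + patch, x:x + patch], k)
            out[y:y + patch, x:x + patch] += w * sr
            acc[y:y + patch, x:x + patch] += w
    return out / np.maximum(acc, 1e-8)

img = np.random.rand(256, 256)
result = patchwise_sr(img, kernel_at=lambda y, x: None)  # dummy kernel lookup
print(result.shape)
```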
Two-step random phase-shifting algorithm based on principal component analysis and VU decomposition method
Zhang Yu
2024, 53(2): 20230596. doi: 10.3788/IRLA20230596
Objective  The level of optical metrology determines the level of optical manufacturing technology, and phase-shifting interferometry (PSI), an easy, high-speed and accurate optical testing tool, is widely used during and after optical fabrication. Both accuracy and efficiency matter in PSI. Outstanding phase-shifting algorithms (PSAs) can reduce the requirements on the interferometer hardware and environment and further improve the accuracy and speed of PSI. Traditional PSAs with known phase shifts are easily affected by piezo-transducer miscalibration and environmental errors. To save time, many single-step PSAs have been developed, but the sign of the phase is difficult to determine from only one interferogram. Where accurate phase reconstruction is required, multi-step PSAs with three or more interferograms have been developed, yet it is difficult to reconstruct the phase with high accuracy and high efficiency simultaneously. Two-step random PSAs, by comparison, avoid the effect of phase-shift error, solve the sign ambiguity of single-step PSAs, and balance accuracy and speed. However, general two-step random PSAs need pre-filtering or complex background-calculation methods, which cost extra time. To balance computational time and accuracy, a fast, high-precision two-step random phase-shifting algorithm based on principal component analysis and the VU decomposition method is proposed in this paper.
Methods  A two-step random phase-shifting algorithm based on principal component analysis and VU decomposition is proposed. First, the two-step principal component analysis method is used to calculate the initial phase for the iteration from two filtered phase-shifting interferograms; then VU decomposition and iteration on the two unfiltered interferograms are used to calculate the final phase. Finally, the proposed method is compared with four good two-step random phase-shifting algorithms for different fringe types, noise levels, phase-shift values and fringe numbers to verify its superior performance in computational time and accuracy.
Results and Discussions  Compared with four good two-step random phase-shifting algorithms, the proposed method has the best overall performance for different fringe types, noise levels, phase-shift values and fringe numbers. It has the highest accuracy, and its effective ranges of phase-shift value and fringe number are the largest. For interferograms of 401 pixel × 401 pixel, the proposed method takes only 0.035 s more than the Gram-Schmidt orthonormalization algorithm and the two-step principal component analysis method. Under ideal conditions, the proposed method yields the exactly correct result. If high precision is required, it is best to suppress noise in advance, keep the phase-shift value away from 0 and π, and use a fringe number greater than 2.
Conclusions  To balance the accuracy and speed of phase calculation, a fast, high-precision two-step random phase-shifting algorithm based on principal component analysis and the VU decomposition method is proposed. The method is characterized by high accuracy, high speed and no filtering: it obtains the accuracy of an iterative algorithm in approximately the time of a non-iterative one, breaking the limit that iterative algorithms cost more time. It is suitable for high-precision optical in-situ measurement and has broad development prospects.
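For orientation, a minimal sketch of the two-step PCA initialization is given below; the per-frame background removal by mean filtering and the synthetic fringe model are assumptions, and the VU decomposition iteration that refines this estimate is omitted:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def two_step_pca_phase(i1, i2, bg_size=31):
    """Initial wrapped-phase estimate from two randomly phase-shifted
    interferograms: remove the background of each frame by low-pass
    filtering, then use PCA (via SVD) to recover two quadrature
    fringe patterns and take their arctangent."""
    f1 = i1 - uniform_filter(i1, size=bg_size)   # filtered interferograms
    f2 = i2 - uniform_filter(i2, size=bg_size)
    data = np.stack([f1.ravel(), f2.ravel()])    # 2 x N data matrix
    _, _, vt = np.linalg.svd(data, full_matrices=False)
    # The two principal components approximate cos(phi) and sin(phi),
    # so the phase follows up to a global sign and piston offset.
    return np.arctan2(vt[1], vt[0]).reshape(i1.shape)

# Synthetic check with an unknown phase shift
h, w = 256, 256
y, x = np.mgrid[-1:1:h * 1j, -1:1:w * 1j]
phi = 6 * np.pi * (x ** 2 + y ** 2)       # defocus-like phase, several fringes
i1 = 1.0 + 0.8 * np.cos(phi)
i2 = 1.0 + 0.8 * np.cos(phi + 1.3)        # random phase shift of 1.3 rad
print(two_step_pca_phase(i1, i2).shape)   # (256, 256)
```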
Structure characteristics sensing method of unmanned aerial vehicle group based on infrared detection
Xia Wenxin, Yang Xiaogang, Xi Jianxiang, Lu Ruitao, Xie Xueli
2024, 53(1): 20230429. doi: 10.3788/IRLA20230429
Objective  With the rapid development of mobile ad-hoc network technology, cooperative control, sensing and detection, and artificial intelligence, unmanned aerial vehicle (UAV) groups have gradually shown distributed, self-organized and non-cooperative swarm-intelligence characteristics. Timely detection of an attacking UAV group allows a wealth of countermeasures to be taken effectively. Countermeasures such as navigation deception, physical capture and physical destruction can handle a small number of UAVs, but once a large number of UAVs gather into a group, countermeasures become difficult. The development of UAV group detection and identification technology is therefore a prerequisite and key to anti-UAV battlefield situational awareness. Existing target detection algorithms do not consider the interrelationship between UAV group members and are prone to missed detections, false detections of group members, and failure to sense the structural characteristics of the group. We therefore propose a method for sensing the structural characteristics of UAV groups based on infrared detection.
Methods  Building on infrared detection and the YOLOv5 algorithm, we propose the GMR-YOLOv5 algorithm for sensing the structural characteristics of UAV groups. The Space-to-Depth Non-strided Convolution (SPD-Conv) module is fused with the Channel Attention Net (CAN) module to form the Space to Depth-Channel Attention Net (SD-CAN) module. The SPD-Conv module converts UAV features from the spatial dimension to the channel dimension but, unlike the channel attention mechanism, does not model the correlation between channels; the designed SD-CAN module realizes the spatial-to-channel conversion while also attending to the UAV features within the channels. Meanwhile, since the texture features of UAV group members are not obvious in infrared images, the Group Members Relation (GMR) module is constructed. It makes full use of structural information such as the positions and bounding box sizes of group members in the infrared image and incorporates this structural information into the association information between group members; unlike existing target detection algorithms, it thus considers the position and bounding box size of UAV group members in the image. Finally, the two modules are fused into the YOLOv5 base network, and the algorithm is validated on a self-built UAV group dataset.
Results and Discussions  Experimental validation was carried out on the constructed Drone-swarms Dataset (Tab.1, Fig.4). The results show that the mAP of the proposed GMR-YOLOv5 algorithm reaches 95.9%, about 7% higher than that of the original YOLOv5, effectively improving the detection accuracy for UAV group members (Tab.4). Meanwhile, the detection speed reaches 59 FPS, achieving real-time detection of UAV group targets and perception of the group's structural characteristics. Compared with classical detection algorithms, GMR-YOLOv5 reduces missed and false detections of UAV targets (Fig.5-Fig.9). Ablation experiments demonstrate the effectiveness of each improved module: although the detection speed decreases, the mAP@0.50 and mAP@0.50:0.95 indexes improve to different degrees (Tab.5).
Conclusions  We propose an algorithm for sensing the structural characteristics of UAV groups based on infrared detection. First, the SPD-Conv and CAN modules are combined into the SD-CAN module, which converts drone features from the spatial dimension to the channel dimension and uses channel attention to make the network focus on group features within the channels, improving the network's feature extraction for UAV group members. Second, using structural information such as the positions and bounding box sizes of group members in the infrared image, the proposed GMR module establishes connections among UAV group members and improves the network's detection and localization of them. Meanwhile, the SIoU loss function is used to accelerate network convergence. Finally, experimental validation on the UAV group dataset yields a network model with an mAP of 95.9% and a detection speed of 59 FPS, achieving UAV group structural characteristic sensing.
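A minimal sketch of the space-to-depth, non-strided convolution idea behind SPD-Conv follows; channel widths and the activation are illustrative assumptions:

```python
import torch
import torch.nn as nn

class SPDConv(nn.Module):
    """Space-to-depth followed by a stride-1 convolution: downsampling
    moves each 2x2 spatial neighborhood into channels instead of
    discarding information with a strided conv, which helps small targets."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.conv = nn.Conv2d(4 * c_in, c_out, kernel_size=3,
                              stride=1, padding=1, bias=False)
        self.bn = nn.BatchNorm2d(c_out)
        self.act = nn.SiLU()

    def forward(self, x):
        # (B, C, H, W) -> (B, 4C, H/2, W/2): interleave 2x2 blocks into channels
        x = torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2],
                       x[..., ::2, 1::2], x[..., 1::2, 1::2]], dim=1)
        return self.act(self.bn(self.conv(x)))

feat = torch.randn(1, 64, 80, 80)
print(SPDConv(64, 128)(feat).shape)   # torch.Size([1, 128, 40, 40])
```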
Image target detection algorithm based on YOLOv7-tiny in complex background
Xue Shan, An Hongyu, Lv Qiongying, Cao Guohua
2024, 53(1): 20230472. doi: 10.3788/IRLA20230472
Objective  Once a "black flying" (unauthorized) drone carries items such as explosives, it can pose a threat to people. Detecting black-flying drones against complex backgrounds such as parks, amusement parks, and schools is the key to anti-drone systems in public areas. This paper aims to detect small-scale targets in complex backgrounds. Traditional hand-crafted image feature extraction methods are not targeted, have high time complexity and redundant windows, detect poorly, and yield low average accuracy, so false and missed detections occur when detecting small-scale UAVs in complex backgrounds. This paper therefore develops a deep learning-based detection model for black-flying UAVs, which is essential for the detection of unmanned aerial vehicles.
Methods  YOLOv7 is a one-stage target detection algorithm with high detection accuracy and good inference speed. YOLOv7-tiny is its lightweight variant, with fewer parameters and fast operation, and is widely used in industry. In the backbone network, the constructed multi-scale channel attention module SMSE (Fig.5) is introduced to enhance attention to UAVs in complex backgrounds. Between the backbone and the feature fusion layer, the RFB feature extraction module (Fig.6) is introduced to enlarge the receptive field and expand feature extraction. In the feature fusion, a small-target detection layer is added to improve the detection of small UAV targets. For the loss calculation, the SIoU loss function redefines the penalty metric, significantly improving training speed and inference accuracy. Finally, ordinary convolutions are replaced by deformable convolutions (Fig.7), making detection fit the shape and size of the object more closely.
Results and Discussions  The dataset combines a self-made dataset (Fig.1) and the Dalian University of Technology drone dataset (Fig.2). The main evaluation indicators are mAP (mean average precision) and FPS (detection speed), with Params (parameter count) and GFLOPS (computation) as secondary indicators. Each module was compared against the original algorithm, including an attention comparison experiment (Tab.1), an RFB module comparison (Tab.2), a small-target detection layer comparison (Tab.3), a loss function comparison (Tab.4), and a deformable convolution comparison (Tab.5). Ablation experiments (Tab.6) confirmed the effectiveness and feasibility of the proposed algorithm through mAP comparison, with accuracy improved by 6.1%. On this basis, the detection performance of different algorithms was compared (Tab.7), and the generalization of the algorithm was verified on the VOC public dataset (Tab.8).
Conclusions  This article proposes an improved object detection algorithm for anti-drone systems. The multi-scale channel attention module enhances attention to small targets, the fused RFB enlarges the receptive field, the added small-target detection layer improves detection ability, and the improved loss function raises training speed and inference accuracy; finally, deformable convolution is introduced to better fit the target size. The improved algorithm achieves good detection results on different datasets.
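As an illustration of the deformable-convolution replacement step, the sketch below uses torchvision's DeformConv2d with a small offset-prediction convolution; where this block is placed in the network is an assumption:

```python
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class DeformBlock(nn.Module):
    """Replace a plain 3x3 conv with a deformable conv: a small conv
    predicts per-position sampling offsets (2 values per kernel tap),
    letting the kernel adapt to the target's shape and size."""
    def __init__(self, c_in, c_out, k=3):
        super().__init__()
        self.offset = nn.Conv2d(c_in, 2 * k * k, k, padding=k // 2)
        self.dconv = DeformConv2d(c_in, c_out, k, padding=k // 2)

    def forward(self, x):
        return self.dconv(x, self.offset(x))

x = torch.randn(1, 32, 40, 40)
print(DeformBlock(32, 64)(x).shape)   # torch.Size([1, 64, 40, 40])
```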
High frame rate target tracking method using domestic FPGA
Wang Xiangjun, Zhu Hui
2023, 52(9): 20220905. doi: 10.3788/IRLA20220905
Objective  Target tracking plays an important role in military, medical and other fields, and the Field Programmable Gate Array (FPGA) is widely used in this area for its good performance and high flexibility. At present, however, owing to the complexity of high-precision tracking algorithms, most target tracking systems are implemented with foreign high-performance chips, which weakens autonomy and controllability. Using domestic chips for target tracking faces the problem that few IP cores are available, so most modules must be designed in Verilog; the feasibility of the algorithm on other domestic FPGAs must also be considered. The objective is therefore to study a tracking algorithm that is easy to design in Verilog, generalizes well, and improves real-time performance and robustness.
Methods  Template matching is easy to design as a pipeline and is selected as the basic algorithm, being widely used for its simplicity and accuracy. Among template matching methods, the one based on the Sum of Absolute Differences (SAD) needs no multiplication or division, which suits resource-limited FPGA implementation, but its matching constraint is too strict, leading to insufficient robustness. Based on the SAD similarity measure, a method of finding the Sum of Minimum Absolute Differences (SMAD) within a window is proposed. To reduce resource usage, maximum and minimum filtering (Fig.3) is used to preprocess the image data in the window before taking the minimum absolute difference, which reduces the adder/subtractor resource consumption of the SMAD method to 31.8%. Moreover, a pyramid-like template update strategy (Fig.4) that is easy to implement in FPGA hardware is proposed to better adapt to target scale changes. To verify the tracking performance of the proposed algorithm, Unigroup FPGAs are used to implement it and build a real-time target tracking system (Fig.5).
Results and Discussions  Using the Average Overlap Rate (AOR) and success rate as indicators, comparison experiments were carried out on the OTB dataset. They verify that the proposed algorithm has a degree of anti-occlusion ability and scale adaptability (Fig.10). Compared with the SAD method, the tracking metrics improve in every scenario; in the scale-change and composite scenarios, adding the pyramid-like strategy to the SMAD method improves its success rate and AOR, verifying the effectiveness of the pyramid-like update strategy (Tab.2). Compared with the robust DDIS algorithm, the proposed method improves the average success rate and overlap rate by 1.18% and 0.13%, respectively, and is easier to implement on an FPGA. The target tracking system is then implemented on domestic FPGAs: the delay is 16 line-synchronization cycles plus 37 clock cycles, and the tracking frame rate reaches 100 frames per second. Different outdoor backgrounds were selected to test tracking under changes of target scale, motion direction and speed (Fig.12). The anti-occlusion test (Fig.13) shows that when the target is partially occluded the system can still track it successfully, further verifying the feasibility of the proposed method.
Conclusions  To break the technology monopoly and improve the autonomy of target tracking products, the traditional SAD template matching method is improved in view of the limitations of domestic FPGAs and of tracking algorithm performance. The SMAD method is proposed and its resource consumption optimized; combined with the pyramid-like template update strategy, its tracking performance is improved. Experiments on the OTB dataset and on the domestic tracking system verify its tracking effect and provide a reference scheme for localizing high frame rate target tracking systems.
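A software reference model of the windowed SMAD measure is sketched below; the search radius and image sizes are illustrative, and the FPGA max/min-filter pipeline itself is only noted in a comment:

```python
import numpy as np

def smad_score(patch, template, r=1):
    """SMAD: for each template pixel, take the minimum absolute
    difference over a (2r+1)x(2r+1) neighborhood in the candidate
    patch, then sum. (The FPGA version approximates this per-pixel
    minimum with max/min pre-filtering to save adders.)"""
    th, tw = template.shape
    mad = np.full((th, tw), np.inf)
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            shifted = patch[r + dy:r + dy + th, r + dx:r + dx + tw]
            mad = np.minimum(mad, np.abs(shifted - template))
    return mad.sum()

def track(frame, template, r=1):
    """Slide the template over the frame and return the best position."""
    th, tw = template.shape
    H = frame.shape[0] - th - 2 * r + 1
    W = frame.shape[1] - tw - 2 * r + 1
    scores = np.empty((H, W))
    for y in range(H):
        for x in range(W):
            scores[y, x] = smad_score(
                frame[y:y + th + 2 * r, x:x + tw + 2 * r], template, r)
    return np.unravel_index(scores.argmin(), scores.shape)

frame = np.random.rand(64, 64).astype(np.float32)
template = frame[20:36, 24:40].copy()
print(track(frame, template))   # (19, 23): template top-left minus the pad r
```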
Denoising algorithm for space target event streams based on event camera
Zhou Xiaoli, Bei Chao, Zhang Nan, Xin Xing, Sun Zuoming
2023, 52(9): 20220824. doi: 10.3788/IRLA20220824
Objective  An event camera captures scene changes in real time, outputting only pixel-level brightness changes as an asynchronous event stream with microsecond resolution. It offers high temporal resolution, high dynamic range, low latency and low bandwidth, and its application to space target detection has gradually attracted researchers' attention. At present, event cameras face two challenges. On the one hand, they are sensitive to environmental changes and output a large amount of noise. On the other hand, remote detection of space targets yields point-target event streams with low signal-to-noise ratio, which places higher demands on the processing of space event streams. Denoising of space target event streams is therefore essential for data preprocessing, and a denoising algorithm based on the event camera is proposed for this purpose.
Methods  For space target event stream data, this paper proposes the Neighborhood Density-based Spatiotemporal Event Filter (NDSEF), which compresses events into frames and suppresses local spatial neighborhood noise within each time neighborhood. Combined with the characteristics of space target trajectories, a circular local sliding window is set to adjust the selection range of the spatial neighborhood, realizing noise filtering based on spatial information (Fig.3). On this basis, a cascade filter based on NDSEF is proposed for different scenes and targets in the space environment: by increasing the accumulation window of the pixel dimension in multiple stages, the multi-dimensional combined filter gradually refines the event data and obtains the best noise reduction performance.
Results and Discussions  The high speed and strong generalization of the denoising algorithm are demonstrated on public datasets and simulation datasets. The scene information of the experimental datasets is shown (Tab.1-2), covering three single-target scenes, a double-target scene and a simulated space scene. The proposed filter outperforms classical filters in signal-to-noise ratio and noise ratio (Fig.6, Tab.3), and the per-event processing time reaches 10 μs, meeting the requirement of real-time detection of space targets. Noise events are also handled effectively for multi-target event streams (Fig.5, Fig.7). The results show that the proposed filter maintains both accuracy and processing speed in low-SNR space scenes.
Conclusions  This paper introduces the NDSEF denoising algorithm for space target event streams, which makes full use of spatio-temporal constraints and the characteristics of low-SNR signals. By compressing events into frames, local spatial neighborhood denoising is performed for each time neighborhood; by exploiting space target trajectory characteristics, a circular local sliding window adjusts the selection range of the spatial neighborhood. On this basis, the cascaded NDSEF filter increases the accumulation of pixel-dimension windows to achieve a high degree of optimization. The experiments show that the proposed filter denoises effectively, the target signal becomes clearly visible, the signal-to-noise ratio and noise ratio improve significantly, and the per-event processing time reaches 10 μs. For space multi-target event streams under extreme conditions it is both accurate and real-time, laying the foundation for event-camera-based space multi-target detection.
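A simplified sketch of the neighborhood-density filtering idea follows; the time-window length, neighborhood radius, density threshold and sensor size are illustrative assumptions (the paper's circular sliding window and cascade stages are not reproduced):

```python
import numpy as np

def ndsef_filter(events, t_win=5000, radius=2, min_density=2,
                 shape=(260, 346)):
    """Keep an event only if enough other events fall inside its
    spatial neighborhood within the same time window; isolated events
    are treated as noise. events: array of rows (t_us, x, y, pol)."""
    events = events[np.argsort(events[:, 0])]
    kept = []
    for start_t in range(int(events[0, 0]), int(events[-1, 0]) + 1, t_win):
        # events of the current time window, accumulated into a count frame
        win = events[(events[:, 0] >= start_t) &
                     (events[:, 0] < start_t + t_win)]
        frame = np.zeros(shape, dtype=np.int32)
        np.add.at(frame, (win[:, 2].astype(int), win[:, 1].astype(int)), 1)
        for e in win:
            x, y = int(e[1]), int(e[2])
            y0, y1 = max(0, y - radius), min(shape[0], y + radius + 1)
            x0, x1 = max(0, x - radius), min(shape[1], x + radius + 1)
            density = frame[y0:y1, x0:x1].sum() - 1  # exclude the event itself
            if density >= min_density:
                kept.append(e)
    return np.array(kept)

# toy stream: a slowly moving point target plus random noise
rng = np.random.default_rng(0)
t = np.arange(0, 50000, 50)
target = np.stack([t, 100 + t // 500, 120 + t // 1000, np.ones_like(t)], 1)
noise = np.stack([rng.uniform(0, 50000, 300),
                  rng.integers(0, 346, 300),
                  rng.integers(0, 260, 300),
                  rng.integers(0, 2, 300)], 1)
events = np.vstack([target, noise])
print(len(events), "->", len(ndsef_filter(events)))
```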
Infrared time-sensitive target detection technology based on cross-modal data augmentation
Wang Siyu, Yang Xiaogang, Lu Ruitao, Li Qingge, Fan Jiwei, Zhu Zhengjie
2023, 52(9): 20220876. doi: 10.3788/IRLA20220876
Objective  Infrared time-sensitive targets are infrared targets such as ships and aircraft that have high military value and whose attack opportunity is limited by a time window. Infrared time-sensitive target detection is widely used in military and civilian fields such as unmanned cruise, precision strike, and battlefield reconnaissance. Deep learning-based target detection has made great progress thanks to powerful computing, deep network structures and large amounts of labeled data. However, images of some high-value targets are difficult and costly to acquire, so infrared time-sensitive target image data are scarce and multi-scene, multi-target training data are lacking, making detection performance hard to guarantee. This paper therefore proposes an infrared time-sensitive target detection technique based on cross-modal data augmentation, which generates "new data" from existing data, expands the infrared time-sensitive target dataset, and improves model detection accuracy and generalization.
Methods  We propose an infrared time-sensitive target detection technique based on cross-modal data augmentation. The augmentation method is a two-stage model (Fig.1): in the first stage, visible-light images containing time-sensitive targets are converted into infrared images by a modality conversion model based on the CUT network; a coordinate attention mechanism is then introduced into the second-stage model to randomly generate a large number of infrared target images, achieving the augmentation effect. Finally, an improved YOLOv5 detection architecture based on the SE and CBAM modules is proposed (Fig.3).
Results and Discussions  The proposed cross-modal augmentation method combines a style transfer model with a target generation model and uses visible-light image datasets to augment infrared time-sensitive target data. Remote sensing visible images are converted into infrared images without loss of size, structure or field of view, and without distortion or noise artifacts. As Fig.6 shows, the generated infrared time-sensitive targets have good texture detail and infrared characteristics and are clearly distinguished from the background. An improved YOLOv5 detection model is proposed, in which SE and CBAM attention mechanisms are added to the CSP network to enhance the network's feature expression and better detect infrared time-sensitive targets. Analysis of Tab.2 shows that, compared with training the detection network on the original data, the proposed augmentation algorithm significantly improves the detection of positive samples: precision, recall, and average precision increase by 14.57%, 5.99%, and 8.82%, respectively. Tab.3 shows that, compared with SSD, Fast R-CNN and YOLOv5, the proposed algorithm improves considerably in precision, average precision and F1; compared with the original YOLOv5 network, precision, recall, average precision, and F1 increase by 7.36%, 5.43%, 2.74%, and 6.45%, respectively. Some test results are shown (Fig.9).
Conclusions  Addressing the scarcity of infrared time-sensitive target data and the resulting poor detection, we propose a cross-modal data augmentation technique for infrared time-sensitive target detection. For the two-stage augmentation model, visible remote sensing images containing time-sensitive targets are first converted into images with infrared characteristics by the modality conversion network; a coordinate attention mechanism is then introduced into the random sample generation model; finally, a YOLOv5 detection method based on the improved CSP module is proposed. Multiple experiments show that the detection accuracy of the proposed algorithm reaches 98.06% on the infrared time-sensitive target dataset, which addresses the lack of infrared time-sensitive target data and provides good detection ability.
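For reference, a minimal squeeze-and-excitation block of the kind inserted into the CSP module is sketched below; the reduction ratio is an assumed typical value:

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-excitation: global-average-pool each channel, pass
    through a small bottleneck MLP, and rescale the channels so the
    network emphasizes target-relevant feature maps."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(x.mean(dim=(2, 3))).view(b, c, 1, 1)   # channel weights
        return x * w

x = torch.randn(2, 128, 20, 20)
print(SEBlock(128)(x).shape)   # torch.Size([2, 128, 20, 20])
```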
An infrared vehicle detection method based on improved YOLOv5
Zhang Xuezhi, Zhao Hongdong, Liu Weina, Zhao Yiming, Guan Song
2023, 52(8): 20230245. doi: 10.3788/IRLA20230245
Objective  Infrared imaging works in low-light and adverse weather conditions. Infrared vehicle detection uses infrared sensors to monitor vehicles on roads, enabling the collection and analysis of information on vehicle quantity and speed for traffic management and safety control. The technology applies not only to road vehicles but also to rail transport, airports, and ports, providing effective technical support for safe and convenient transportation. However, infrared vehicle detection still faces many challenges due to the low resolution, low contrast, and blurred edges of small targets in infrared images. Traditional hand-crafted feature extraction methods are neither adaptive nor robust, require substantial prior knowledge, and have low efficiency. This paper therefore explores deep learning-based vehicle detection models, which play an important role in traffic regulation.
Methods  YOLOv5 is a one-stage object detection algorithm characterized by a lightweight design, easy deployment, and high accuracy, making it widely used in industrial applications. Because infrared images have low resolution, a CFG mixed attention mechanism (Fig.2) is introduced into the backbone to help the model better locate the vehicle regions in the image and improve its feature extraction. In the feature fusion part, an improved Z-BiFPN structure (Fig.5) is proposed to incorporate more information in the shallow fusion and thereby improve the utilization of shallow information. A small-object detection layer is added, and the Decoupled Head (Fig.6) is used to separate classification and regression, improving the detection of small target vehicles.
Results and Discussions  To improve the model's generalization, an infrared image dataset, INFrared-417 (Fig.7), consisting of seven categories (bus, truck, car, van, person, bicycle and elecmot), was constructed by collecting data and combining existing infrared datasets. The main evaluation metrics are AP (average precision) and mAP (mean average precision), with P (precision) and R (recall) as secondary metrics. The ablation results (Tab.1) confirm the effectiveness and feasibility of the proposed improvements: mAP improves by 4.0%, AP improves markedly for the van, person, and bicycle categories, P increases by 1.7% and R by 3.6%. The comparison results (Fig.10) show that the improved model reduces false alarms and missed detections while improving small-target detection, and the comparison experiments (Tab.2) show excellent performance in both detection accuracy and parameter count.
Conclusions  This paper proposes an improved infrared vehicle detection algorithm. The mixed attention mechanism lets the model focus on vehicle regions and enhances feature extraction; the improved Z-BiFPN in the model neck efficiently integrates context information; the detection head is replaced with the more advanced Decoupled Head, and a small-object detection layer is added to better capture small targets. It is hoped that this model can be applied in traffic control.
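A compact sketch of a decoupled detection head follows; the channel widths, anchor count, and output layout are illustrative assumptions:

```python
import torch
import torch.nn as nn

class DecoupledHead(nn.Module):
    """Separate classification and regression branches after a shared
    stem, instead of predicting both from one coupled convolution."""
    def __init__(self, c_in, num_classes, num_anchors=3, c_mid=128):
        super().__init__()
        self.stem = nn.Conv2d(c_in, c_mid, 1)
        self.cls_branch = nn.Sequential(
            nn.Conv2d(c_mid, c_mid, 3, padding=1), nn.SiLU(),
            nn.Conv2d(c_mid, num_anchors * num_classes, 1))
        self.reg_branch = nn.Sequential(
            nn.Conv2d(c_mid, c_mid, 3, padding=1), nn.SiLU(),
            nn.Conv2d(c_mid, num_anchors * 5, 1))   # box (4) + objectness (1)

    def forward(self, x):
        x = self.stem(x)
        return self.cls_branch(x), self.reg_branch(x)

feat = torch.randn(1, 256, 40, 40)
cls_out, reg_out = DecoupledHead(256, num_classes=7)(feat)
print(cls_out.shape, reg_out.shape)   # (1, 21, 40, 40) (1, 15, 40, 40)
```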
A study on the uniform distribution and counting method of raw cow's milk somatic cells based on microfluidic chip
Zhou Wei, Wang Minghui, An Guangxin, Zheng Hongbiao, Li Xingyu, Meng Qingyi
2023, 52(8): 20230265. doi: 10.3788/IRLA20230265
Objective  The somatic cell count (SCC) in raw milk is an important basis for determining whether a cow suffers from mastitis. Identifying cows with mastitis by testing the SCC, then isolating and treating them as early as possible, can effectively prevent the spread of bacteria in the herd and reduce the consequent economic losses. However, traditional methods can produce an uneven distribution of somatic cells during milk sampling, for example through cell adhesion and settlement, and give unrepresentative counts for lack of a matched imaging system. This paper proposes a method based on a nine-cell-grid microfluidic chip that distributes somatic cells evenly, together with a two-degree-of-freedom displacement platform equipped with a micro camera lens, to improve counting accuracy.
Methods  First, a simulation verified the uniformity of the somatic cell distribution within the chip's observation cavities (Fig.1), and based on the simulation results a nine-cell-grid microfluidic chip was fabricated (Fig.2). Second, a two-degree-of-freedom displacement platform (Fig.6) equipped with a micro camera lens was developed, which automatically images the nine observation cavities of the chip, making image acquisition more convenient. Finally, somatic cells were counted by image processing (Fig.3) to verify the uniformity of the distribution, obtain the counting accuracy, and judge the health of the cow's udder.
Results and Discussions  Twenty cows were randomly selected from local pastures to verify the performance of the proposed method. The data in Tab.1 show that the coefficient of variation of the SCC across each group of nine images is at most 1.61%, verifying the uniform distribution of somatic cells across the nine observation cavities of the microfluidic chip; the system thus ensures uniform distribution and makes the sampled images more representative. As Fig.9(b) shows, the absolute relative error of automatic counting has a maximum of 2.93%, a minimum of 0.53%, and an average of 1.72%, giving a maximum relative count error of ±2.93%. The system's automatic counting is highly accurate.
Conclusions  The experimental results show that the developed somatic cell counting system makes the distribution of somatic cells in fresh milk more uniform and the counting more accurate. The coefficient of variation of the somatic cell counts in each group of nine images is at most 1.61%, and the smaller this coefficient, the more uniform the distribution. The accuracy of automatic counting ranges between 97.07% and 99.47%. This method lays a foundation for the detection and prevention of mastitis in cows.
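A small sketch of the image-based counting and uniformity check is given below; the Otsu threshold, the minimum blob area, and the use of OpenCV connected components are assumptions, since the paper only summarizes its pipeline as image processing:

```python
import cv2
import numpy as np

def count_cells(gray):
    """Count somatic cells in one observation-cavity image via
    thresholding and connected-component analysis."""
    _, mask = cv2.threshold(gray, 0, 255,
                            cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    n, _, stats, _ = cv2.connectedComponentsWithStats(mask)
    # drop the background label and tiny specks
    return int(np.sum(stats[1:, cv2.CC_STAT_AREA] > 20))

def uniformity(counts):
    """Coefficient of variation of the nine cavity counts: the
    smaller it is, the more uniform the cell distribution."""
    counts = np.asarray(counts, dtype=float)
    return counts.std() / counts.mean()

counts = [count_cells(np.random.randint(0, 255, (480, 640), np.uint8))
          for _ in range(9)]      # stand-ins for the nine cavity images
print(counts, f"CV = {uniformity(counts):.2%}")
```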
Cross-modal geo-localization method based on GCI-CycleGAN style translation
Li Qingge, Yang Xiaogang, Lu Ruitao, Wang Siyu, Fan Jiwei, Xia Hai
2023, 52(7): 20220875. doi: 10.3788/IRLA20220875
  Objective   The purpose of this research is to propose a cross-modal image geo-localization method based on GCI-CycleGAN style translation for vision-based autonomous visual geo-localization technology in aircraft. The technology is essential for navigation, guidance, situational awareness, and autonomous decision-making. However, existing cross-modal geo-localization tasks have issues such as significant modal differences, complex matching, and poor localization robustness. Therefore, real-time infrared images and visible images with known geo-location information are acquired with the proposed method, and a GCI-CycleGAN model is trained to convert visible images into infrared images using generative adversarial network image style translation. The generated infrared images are matched with real-time infrared images using various matching algorithms, and the position of the real-time infrared image center point in the generated image is obtained through perspective transformation. The positioning point is then mapped to the corresponding visible image to obtain the final geo-localization results. The research is crucial as it provides a solution to the challenges faced by existing cross-modal geo-localization tasks, improving the quality and robustness of geo-localization outcomes. A higher matching success rate and a more accurate average geo-localization error are achieved with the GCI-CycleGAN and DFM intelligent matching algorithms. The proposed method has significant practical implications for vision-based autonomous visual geo-localization technology in aircraft, which plays a crucial role in navigation and guidance, situational awareness, and autonomous decision-making.  Methods   The research describes a proposed method for cross-modal image geo-localization based on GCI-CycleGAN style translation (Fig.1). First, the real-time infrared and visible light images of the drone's direct down view aerial photography are obtained (Fig.10). The GCI-CycleGAN model structure (Fig.3) and the generated confrontation loss function were designed and trained on the RGB-NIR scene dataset (Fig.5). The trained GCI-CycleGAN model is utilized to perform style transfer on visible light images, resulting in more realistic pseudo infrared images (Fig.8). Using various matching algorithms, including SIFT, SURF, ORB, LoFTR (Fig.6), and DFM (Fig.7), the generated pseudo infrared image is matched with the real-time infrared image to obtain the feature point matching relationship (Fig.9). The homography transformation matrix is determined based on the matching relationship of feature points. Based on the homography transformation matrix, perspective transformation is performed on the center point of the real-time infrared image to determine the pixel points corresponding to the center point in the pseudo infrared image. Then the pixel points corresponding to the center point in the pseudo infrared image are mapped to the visible light image, and the mapping points in the visible light image are determined (Fig.11). Finally, based on the geographic location information corresponding to the mapping points in the visible light image, the geographic positioning results of the drone are obtained (Fig.12).  
Results and Discussions   The experimental results demonstrate that, compared to CycleGAN, GCI-CycleGAN pays more attention to expressing detailed texture features, generates infrared images without distortion, and is closer to the target infrared image in brightness and contrast, effectively improving the quality of image style translation (Tab.1). The combination of GCI-CycleGAN and the DFM intelligent matching algorithm achieves a matching success rate of up to 99.48%, 4.73% higher than the original cross-modal matching result, and the average geo-localization error is only 1.37 pixels, yielding more accurate and robust geo-localization.  Conclusions   This article studies the geographic positioning problem of cross-modal image matching through style translation between infrared and visible light images captured in aerial photography. A cross-modal image intelligent matching method based on GCI-CycleGAN is proposed, which combines generative adversarial networks with matching algorithms to solve the geographic positioning problem based on matching visible light and infrared aerial images. First, a new loss function is designed to construct a GCI-CycleGAN model that transfers the style of visible images; then the LoFTR and DFM intelligent matching algorithms are used to achieve effective matching between the generated images and real-time infrared images; finally, the matching relationship is mapped back to the original cross-modal image pair to obtain the final geographical positioning result. The experimental results show that the proposed method effectively achieves cross-modal transformation of images and significantly improves the success rate of the matching algorithms, demonstrating the value and significance of this geographic positioning method. In the future, deploying the proposed algorithm on embedded edge computing devices while balancing cost, power consumption, and computing power, so that the algorithm meets effectiveness and real-time requirements, remains a challenging problem in practical engineering applications.
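To make the homography and perspective-transformation step described above concrete, here is a minimal Python/OpenCV sketch (an editorial illustration, not the authors' code) that maps the real-time infrared image center into the generated pseudo-infrared image from matched keypoints. The function name and the RANSAC reprojection threshold are assumptions.

```python
import cv2
import numpy as np

def localize_center(pts_rt_ir, pts_gen_ir, rt_shape):
    """Map the real-time IR image center into the generated image.

    pts_rt_ir, pts_gen_ir: matched keypoint coordinates (N x 2)
    from any matcher (e.g., SIFT or DFM); rt_shape: (h, w) of the
    real-time IR frame. A sketch of the homography step, assuming
    at least 4 matches.
    """
    # Robustly estimate the homography with RANSAC.
    H, inliers = cv2.findHomography(
        np.float32(pts_rt_ir), np.float32(pts_gen_ir),
        cv2.RANSAC, 3.0)
    h, w = rt_shape
    center = np.float32([[[w / 2.0, h / 2.0]]])   # shape (1, 1, 2)
    # Perspective-transform the center point through H.
    mapped = cv2.perspectiveTransform(center, H)
    return mapped[0, 0]   # (x, y) in the pseudo-infrared image
```

The returned pixel would then be looked up in the geo-referenced visible image to read off the drone's position, as the abstract describes.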
Research progress of laser dazzle and damage CMOS image sensor (invited)
Wen Jiaqi, Bian Jintian, Li Xin, Kong Hui, Guo Lei, Lv Guorui
2023, 52(6): 20230269. doi: 10.3788/IRLA20230269
[Abstract](335) [FullText HTML] (99) [PDF 2874KB](112)
  Significance  Complementary Metal Oxide Semiconductor (CMOS) image sensors are currently the most mainstream solid-state image sensors. They feature low power consumption, high integration, and fast imaging. In the past decade, their performance has improved continuously, surpassing Charge Coupled Device (CCD) image sensors in market share and product iteration speed. They are widely used in fields such as digital cameras, security monitoring equipment, mobile phones, drones, medical detection, and autonomous driving. As the core component of an optoelectronic imaging system, the image sensor strongly absorbs laser energy within its working waveband, making it more susceptible to laser damage than other components of the optoelectronic system. However, the new back-side-illuminated CMOS and stacked CMOS differ significantly in structure from traditional front-side-illuminated CMOS image sensors, and their ability to resist laser interference and damage is greatly improved. Therefore, the laser interference effect and damage mechanism of CMOS image detectors have received widespread attention from scholars at home and abroad.  Progress  Firstly, the structure and working principle of the CMOS image sensor are introduced in the context of its development history. The pixel structure of CMOS has evolved from passive pixels to active pixels, where each pixel can independently collect, amplify, and output signals; a SiO2 deep trench isolation (DTI) structure (Fig.3(d)) between pixels suppresses crosstalk. The chip structure has evolved from front-illuminated to back-illuminated and stacked, with the metal wiring layer buried ever deeper, making destructive damage more difficult to cause. On this basis, the weaknesses of the CMOS image sensor under laser irradiation are briefly analyzed. CMOS uses a correlated double sampling (CDS) circuit, which outputs the difference between two signals, so interfering with both signals causes pixel oversaturation; the use of the same column line to transmit the reference signal of a column of pixels makes large-scale crosstalk possible. Damage at different stages is related to the depth of laser action, and it can be concluded that the key to causing large-scale damage to CMOS image sensors is severe damage to the internal circuit layer. As CMOS image sensors are used more and more widely, increasing attention has been paid to experimental studies of laser-induced dazzle and damage of CMOS. The evaluation methods for interference and the main measurement methods for damage thresholds are summarized, together with representative measurements of interference and damage thresholds (Tab.1-2). By comparing the interference results, the conditions for oversaturation and crosstalk are summarized, and the conclusion that the above-mentioned CDS circuit is susceptible to interference is verified. Compared with CCD, CMOS has better damage resistance, especially back-illuminated CMOS, in which large-area damage is difficult to cause: the circuit layer of back-illuminated CMOS lies deeper, protected by a thicker layer of silicon-based material that forms a certain inherent protective layer. With the wide application of back-illuminated and stacked CMOS chips, how to improve the damage efficiency of lasers against CMOS chips is an urgent problem for future research.
Finally, the development status and prospects of using new laser systems to improve the ability to damage CMOS image sensors are discussed. A composite laser can be made up of two pulses with different parameters. The ablation and damage of single-material targets by composite lasers has been well studied; if the laser parameters are matched properly, the absorption of laser energy can be improved effectively. It has been shown that the composite laser can improve the efficiency of damaging CMOS to some extent, but the effect is limited. To further improve laser damage efficiency, one can consider further increasing the adjustable parameters of the laser, for example by combining three or more pulses into a pulse-train form.  Conclusions and Prospects   CMOS image sensors are booming and have become the most mainstream image sensors. As an important countermeasure, research on laser jamming and damage of CMOS image sensors needs further exploration. The purpose of this paper is to provide references for future research on laser jamming and damage of CMOS, and the idea of using new laser systems to improve damage efficiency is proposed.
Irradiation effect of 2.79 μm mid-infrared laser on CMOS image sensor
Wang Xi, Zhao Nanxiang, Zhang Yongning, Wang Biyi, Dong Xiao, Zou Yan, Lei Wuhu, Hu Yihua
2023, 52(6): 20230168. doi: 10.3788/IRLA20230168
[Abstract](198) [FullText HTML] (63) [PDF 1394KB](60)
  Objective  CMOS image sensors are widely used in aerospace, security monitoring, industrial control, navigation and guidance, image recognition systems, and other fields. Most research on the laser irradiation effects of CMOS image sensors has focused on the visible and near-infrared bands. With more and more lasers of different wavelengths in use, optical systems face a serious risk of damage from out-of-band laser irradiation, and systematic experimental studies of the interaction between out-of-band lasers and photodetectors are necessary. In photoelectric countermeasures, it is very important to study whether out-of-band laser irradiation can effectively interfere with or damage a detector, and what the underlying mechanism is. The 2.79 μm mid-infrared wavelength lies in an atmospheric window, with small atmospheric scattering and a long propagation distance; this band is also the working band of most reconnaissance, surveillance, early-warning, and other space-based systems. In future space applications, the high-power 2.79 μm mid-infrared laser has broad prospects. Therefore, studying the irradiation effect of the mid-infrared laser on the CMOS image sensor is of great reference value for the laser attack and defense field.  Methods  An experiment irradiating a CMOS image sensor with a 2.79 μm mid-infrared laser is carried out (Fig.1). A computer connected to the output signal of the CMOS image sensor observes and records the effect of laser irradiation. To study the damage effect of laser irradiation on the CMOS image sensor, the experiment is divided into two stages. In the first stage, the laser energy is irradiated directly on the sensor without the sapphire focusing lens, and the interference effect of the 2.79 μm mid-infrared laser on the CMOS image sensor is studied. In the second stage, the sapphire focusing lens is placed in the optical path to study the damage effect. A differential interference contrast (DIC) microscope is used to observe the damage morphology of the CMOS sensor samples.  Results and Discussions  The laser interference experiments show that saturation and oversaturation appear on the CMOS image sensor as the laser energy increases (Fig.3). After laser irradiation stops for a period of time, the CMOS can automatically return to the normal working state. The results also show that with increasing repetition frequency, the CMOS image sensor needs less laser energy and less time to reach full-screen saturation or oversaturation (Fig.5). The laser damage experiments show that the phenomena of saturation, oversaturation, black screen, green screen, and bright lines are observed at different laser repetition frequencies (Fig.8-9). The damage morphology shows obvious melting damage in the laser spot irradiation area: the high laser energy at the beam center leads to ablation and evaporation of a large area of pixel material, and the periphery of the spot area is obviously heated, but no cracks appear (Fig.10). This indicates that the damage of the 2.79 μm mid-infrared laser to the CMOS sensor is mainly due to thermal melting of materials, and the thermal effect is obvious.  Conclusions  The experimental results indicate that the CMOS image sensor has good anti-interference and anti-damage ability.
The thresholds of the CMOS image sensor irradiated by the 2.79 μm mid-infrared laser at a 10 Hz pulse repetition frequency are 0.44 J/cm² for saturation, 0.97 J/cm² for oversaturation, and 203.71 J/cm² for damage. The damage threshold of the CMOS image sensor is thus much higher than its interference thresholds. The experimental results show that the damage mechanism of the CMOS image sensor is mainly melting damage, and the thermal effect is obvious.
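As a worked illustration of how these thresholds order the irradiation effects, the short Python sketch below converts a pulse energy and spot size into fluence and compares it against the reported 10 Hz thresholds. The pulse energy and spot radius in the example are invented for illustration only; the real response also depends on pulse width, repetition frequency, and exposure time.

```python
import math

# Thresholds reported at 10 Hz pulse repetition frequency (J/cm^2).
SATURATION, OVERSATURATION, DAMAGE = 0.44, 0.97, 203.71

def classify_effect(fluence):
    """Order a single-pulse fluence against the reported thresholds."""
    if fluence >= DAMAGE:
        return "damage (thermal melting / ablation)"
    if fluence >= OVERSATURATION:
        return "oversaturation"
    if fluence >= SATURATION:
        return "saturation"
    return "normal operation"

# Hypothetical example: a 5 mJ pulse in a 1 mm-radius spot.
fluence = 5e-3 / (math.pi * 0.1 ** 2)   # E / (pi r^2), r in cm
print(round(fluence, 3), classify_effect(fluence))
# ~0.159 J/cm^2 -> "normal operation"
```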
"Cat's eye" echo information assessment method of CCD damage status
Yu Ting, Niu Chunhui, Lv Yong
2023, 52(5): 20220537. doi: 10.3788/IRLA20220537
[Abstract](133) [FullText HTML] (35) [PDF 1926KB](36)
  Objective   The Charge Coupled Device (CCD) is a common photoelectric sensor for acquiring image information in photoelectric warfare, where active detection, optical performance analysis, and damage status assessment of enemy CCD devices are prerequisites for effective operations. At present, there are few studies on CCD damage status and damage grade assessment based on detection echo information, and practical assessment is affected by the complex environment. The CCD damage status has a complex nonlinear relationship with the "cat's eye" echo intensity and polarization degree, so it is impossible to correctly judge whether the CCD is damaged from the intensity and polarization values alone. Therefore, a multi-source information fusion method is considered for CCD damage status assessment, that is, combining multiple kinds of CCD characteristic information to obtain an optimal estimate.  Methods  Combining multi-source information fusion technology with machine learning, three models suitable for nonlinear data classification and discrimination, KNN, K-SVM, and PNN, are used to study the assessment of CCD damage status. Among the three methods, KNN predicts the category from the categories of neighboring points, K-SVM predicts the category with a hyperplane, and PNN predicts the category from a posterior probability density.  Results and Discussions   Near- and long-distance "cat's eye" echo detection experiments were carried out, and the echo intensity, polarization degree, and actual CCD damage information were used as input data to train the three models (Tab.3). The assessments of the three models were compared in terms of the number of assessment-point errors, the error rate, and the assessment time (Fig.5-6): over five random test sets, the error rates of KNN and K-SVM fluctuate within 4%, while the error rate of PNN fluctuates within 2%. The selection of the test set has a great impact on KNN and K-SVM, but the error rate of PNN is relatively stable, so test-set selection does not affect PNN. The assessment performance in different scenarios is compared using the average results of the five random test sets (Tab.4). The near-distance assessment results are close, with average error rates of 2%-3%. The average error rates of the long-distance assessment are 7%-12%, where the lowest average error rate is accompanied by a short prediction time but stability not as good as K-SVM. The average error rates on mixed data are 10%-14%, where PNN has the lowest average error rate and good stability, but its prediction time is about twice that of the other methods.  Conclusions   The PNN model with the optimal smoothing factor had the lowest error rate in the complex outdoor environment; considering the allowable time range of practical assessment, the PNN model is the most suitable for CCD damage status assessment from "cat's eye" echo information. The PNN model has a better comprehensive assessment effect than the other two methods and the best stability in the comprehensive environment.
The research results are an exploration of laser damage status assessment; they help improve the ability to assess detected targets and the intelligence of the system, and provide a new idea for non-contact laser active detection and assessment technology and for improving the defense and strike capability of weapon systems.
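For readers unfamiliar with the two baseline classifiers, the following scikit-learn sketch shows KNN and an RBF-kernel SVM classifying (intensity, polarization) echo features. The feature vectors and labels are invented placeholders, not the paper's data, and the paper's best model, PNN, has no standard scikit-learn implementation and is omitted here.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Hypothetical training data: each row is (echo intensity,
# polarization degree); labels 0/1/2 stand for damage grades.
X = np.array([[0.82, 0.61], [0.35, 0.22], [0.90, 0.70],
              [0.20, 0.15], [0.55, 0.40], [0.30, 0.58]])
y = np.array([0, 1, 0, 2, 1, 2])

# KNN: vote among the k nearest training points.
knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
# Kernel SVM: separating hyperplane in an RBF feature space.
ksvm = SVC(kernel="rbf", gamma="scale").fit(X, y)

sample = np.array([[0.60, 0.45]])   # a new "cat's eye" echo
print(knn.predict(sample), ksvm.predict(sample))
```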
Research on improved tracking feedforward control method based on sensor fusion prediction
Li Hang, Peng Gaoliang, Lin Hongzhao, Chen Zhao
2023, 52(5): 20220665. doi: 10.3788/IRLA20220665
[Abstract](94) [FullText HTML] (28) [PDF 3333KB](33)
  Objective  The photoelectric tracking system (Acquisition, Tracking, and Pointing, ATP) is equipment that uses photoelectric technology to point at and track targets, with high measurement and tracking accuracy. Existing ATP systems usually carry precise optical systems and detectors, which can accurately locate, track, and aim at a target. For high-speed target tracking, the time delay of sensor feedback such as images becomes the main factor limiting the system's maximum tracking speed; the delay link in system feedback has become the bottleneck restricting the improvement of the ATP system's tracking ability. Therefore, an improved tracking feedforward control method based on sensor fusion prediction is proposed to solve the problem of tracking high-speed targets with ATP.  Methods  Firstly, sensor data from the CCD and a high-precision encoder are fused, and the target motion state is tracked according to the differential tracking principle to obtain high-order information on the target motion, with the noise caused by differentiation greatly reduced. Secondly, a reduced-order CA model is proposed to reduce the computation and the number of estimated parameters, and the pure-delay link of the miss distance is compensated according to the Kalman filter principle to obtain low-delay target motion state information. Thirdly, least-squares polynomial fitting is performed using only the results of the previous moment, which avoids ill-conditioned matrices in the least squares, greatly reduces the computation of the fitting, and extends the CCD feedback from a low-frequency to a high-frequency signal. Finally, based on the prediction results and the higher-order motion information, a tracking feedforward control loop is designed to improve the response speed and tracking ability of the system.  Results and Discussions  A new control method for tracking high-speed targets with an ATP system is proposed. High-order motion information of the target is obtained through sensor fusion, and Kalman prediction based on the reduced-order CA model is carried out. The input deviation after prediction compensation is shown (Fig.12); the error is reduced by about 88.22%. Combining the least-squares fitting of the previous moment avoids ill-conditioned matrices in the least squares and realizes the extension of the data signal, ensuring the data stability of the system.  Conclusions  An improved tracking feedforward control method based on sensor fusion prediction is proposed, addressing the problem that the low feedback frame rate and large delay of the CCD camera in a photoelectric tracking system result in poor tracking of and response to high-speed targets. Simulation and experimental results show that, when tracking high-speed targets, the tracking error caused by image lag can be greatly reduced without changing the closed-loop stability of the control system. Actual tests show that the tracking error after compensation is about 83.67% smaller than before compensation. This method compensates image delay more effectively, improves the system control bandwidth, and provides an effective idea for high-performance tracking control of ATP systems.
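To illustrate the delay-compensation principle described above, here is a minimal one-axis constant-acceleration (CA) Kalman predictor in Python/NumPy: it filters delayed position measurements and then extrapolates over the known delay. This is a generic sketch of the idea, not the paper's reduced-order design; the noise intensities `q` and `r` are placeholder assumptions.

```python
import numpy as np

def ca_kalman_predict(z_hist, dt, delay_steps, q=1e-3, r=1e-2):
    """Filter delayed position measurements z_hist with a CA-model
    Kalman filter, then predict delay_steps ahead to compensate
    the sensor lag. Returns the predicted [pos, vel, acc].
    """
    F = np.array([[1.0, dt, 0.5 * dt**2],
                  [0.0, 1.0, dt],
                  [0.0, 0.0, 1.0]])        # CA state transition
    Hm = np.array([[1.0, 0.0, 0.0]])       # only position is measured
    Q = q * np.eye(3)                      # process noise (placeholder)
    R = np.array([[r]])                    # measurement noise (placeholder)
    x, P = np.zeros((3, 1)), np.eye(3)
    for z in z_hist:                       # standard predict/update cycle
        x = F @ x
        P = F @ P @ F.T + Q
        S = Hm @ P @ Hm.T + R
        K = P @ Hm.T @ np.linalg.inv(S)
        x = x + K @ (np.array([[z]]) - Hm @ x)
        P = (np.eye(3) - K @ Hm) @ P
    for _ in range(delay_steps):           # extrapolate over the delay
        x = F @ x
    return x.ravel()
```

Feeding the extrapolated velocity and acceleration into a feedforward branch, rather than waiting for the delayed image feedback, is what lets such a loop respond faster without altering closed-loop stability.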
Fast detection of moving targets in long range surveillance LiDAR
Feng Jie, Feng Yang, Liu Xiang, Deng Chenjin, Yu Zhongjun
2023, 52(4): 20220506. doi: 10.3788/IRLA20220506
[Abstract](140) [FullText HTML] (29) [PDF 2939KB](41)
  Objective   Lidar is a sensor based on active laser imaging, with the advantages of high detection accuracy, all-weather operation, easy access to high-precision three-dimensional information, and a long effective detection range. It has been widely used in recent years, especially in autonomous driving as the three-dimensional environment perception device of autonomous vehicles. When lidar is applied to perimeter surveillance in long-range mode, the target point cloud is relatively sparse, unlike that of high-resolution microwave imaging radar such as ISAR. When deep learning is used for training and real-time recognition of cooperative targets, the recognition speed for 3D point cloud data with 6 000-7 000 points per frame is lower than 12 frame/s, and many missed alarms emerge; the target recognition rate needs to be improved. In order to guide a high-resolution infrared camera to perform fine imaging of the detected target before recognition, fast detection of moving targets is investigated, and a processing method for complex scenes using 3D Gaussian methods and clutter-map CFAR to detect moving targets is provided.  Methods   The flow diagram of lidar moving target detection based on 3D point cloud data is given (Fig.2), including 3D point cloud mesh construction, noise filtering by 3D bilateral filtering, and target/background segmentation. The principles of the 3D single Gaussian method and the 3D Gaussian mixture method for target/background segmentation are given, and a clutter-map CFAR detection method is proposed (Fig.1). Using 72 frames of data from actual equipment, the results of the Faster RCNN Resnet50 FPN deep-learning method, the two-dimensional single Gaussian method, the three-dimensional single Gaussian method, the three-dimensional Gaussian mixture method, and the clutter-map CFAR method are compared.  Results and Discussions  Comparative experiments show that the Faster RCNN Resnet50 FPN deep-learning model achieves an average accuracy rate of 0.318 4 and an average recall rate of 0.329 4, with a processing time of 0.5 s per image frame and 2 s for point cloud data, which means this method is hardware-intensive and can hardly meet general engineering requirements. Among the other methods (Tab.2), the two-dimensional single Gaussian model has very high real-time performance, but there are many false alarms, with almost every frame affected. The 3D single Gaussian model produces false alarms in some frames (Fig.7). By adjusting the parameters of the 3D Gaussian mixture model, the number of false alarms can be reduced to zero with no missed alarms (Fig.8). The false alarm rate also decreases significantly with the clutter-map CFAR method (Fig.9). At the same time, the processing time of the clutter-map CFAR method is basically the same as that of the 3D single Gaussian model and much less than that of the 3D mixture model, so it can meet actual engineering needs. The 3D Gaussian mixture model needs further optimization or parallel processing to improve its real-time performance.  Conclusions   At present, when deep learning is directly used to detect and recognize moving targets for lidar working in remote monitoring mode, the real-time performance and detection rate cannot fully meet actual engineering requirements.
The project's combination of lidar and a high-resolution infrared camera requires the lidar to detect moving targets and guide the imaging and recognition of the infrared high-resolution camera. Because of their high false alarm rates, the two-dimensional and three-dimensional single Gaussian methods are difficult to adapt to complex backgrounds and cannot meet the requirements. The three-dimensional Gaussian mixture model adapts very well to complex backgrounds, but its real-time performance suffers from the increased computation caused by updating the background parameters, so it too cannot meet the requirements. In contrast, for scenes with complex backgrounds, clutter-map CFAR detection of the point cloud data improves both the accuracy and the real-time performance of detection, thus meeting the requirements of practical engineering.
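To sketch the clutter-map CFAR idea in code (an editorial illustration of the general principle, not the paper's parameterization), the class below tracks a per-cell clutter level on a gridded point-cloud map with exponential forgetting and declares a detection when a cell exceeds a fixed multiple of its clutter estimate; `alpha` and `forget` are assumed values.

```python
import numpy as np

class ClutterMapCFAR:
    """Cell-by-cell clutter-map CFAR for a gridded point-cloud map.

    Each cell's clutter level is tracked by recursive averaging
    with a forgetting factor; a cell fires when the new frame
    exceeds alpha times its clutter estimate.
    """
    def __init__(self, shape, alpha=3.0, forget=0.9):
        self.clutter = np.zeros(shape)
        self.alpha = alpha      # threshold multiplier (sets Pfa)
        self.forget = forget    # clutter-map forgetting factor

    def detect(self, frame):
        # Detection test against the per-cell adaptive threshold.
        hits = frame > self.alpha * np.maximum(self.clutter, 1e-6)
        # Update the background (clutter) estimate afterwards.
        self.clutter = (self.forget * self.clutter
                        + (1.0 - self.forget) * frame)
        return hits

# Usage: feed successive gridded occupancy/intensity frames.
cfar = ClutterMapCFAR(shape=(128, 128))
# mask = cfar.detect(next_frame)   # boolean moving-target mask
```

Because the threshold adapts per cell, stationary background structure is absorbed into the clutter map, which is why this approach tolerates complex backgrounds at near single-Gaussian cost.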
Target recognition method of digital laser imaging fuze in ultra-low sea background
Meng Xiangsheng, Li Lekun
2023, 52(4): 20220548. doi: 10.3788/IRLA20220548
[Abstract](183) [FullText HTML] (42) [PDF 4468KB](30)
  Objective   In order to meet the tactical requirements of modern naval warfare, air-to-air missiles should be capable of intercepting ultra-low-altitude sea-skimming targets such as anti-ship missiles and cruise missiles. At present, the most advanced of these targets can fly at a height of 3 m above the sea. In this case, complex sea background clutter not only affects the detection and tracking of the target by the missile guidance system, but also enters the range of the fuze at the end of the rendezvous phase, seriously affecting fuze operation and leading to false alarms or reduced initiation capability. Therefore, improving the fuze's resistance to ultra-low-altitude sea background interference has long been a research focus for expanding its battlefield adaptability. Conventional laser fuzes use technologies such as multi-quadrant zonal wave-gate compression and dual-beam detection to suppress sea clutter, but each has certain application limitations. In this paper, a low-altitude sea background target recognition method based on digital laser imaging is proposed. The method exploits the difference in the spatial distribution of imaging characteristics between the sea surface and a physical target, and uses the fine recognition ability of laser imaging for echo characteristics at different azimuth angles, improving the adaptability of the proximity fuze for reliable operation in the ultra-low-altitude sea environment.  Methods   A dynamic sea surface laser echo simulation system is established to obtain the laser scattering characteristics of the sea surface and the target. The simulation system can set the field-of-view parameters of the laser imaging system and obtain in real time the echo signal characteristics under different intersection conditions and sea states. A facet-segmentation sea surface scattering model is used to calculate the laser echo distribution characteristics under different detection field-of-view parameters; the simulation flow chart is shown (Fig.1). Through statistics and analysis of the distribution characteristics of the laser scattering echo on the sea surface, a target recognition method based on a laser imaging system for the low-altitude sea background is designed and verified by simulation.  Results and Discussions   In terms of target characteristic simulation, for an imaging detection system with a subdivided narrow field of view, the undulation of the sea surface makes the sea surface echo present discrete flicker features in its spatial distribution (Fig.5), significantly different from the spatially continuous imaging features of solid targets (Fig.6). In terms of recognition method design, a circumferential 360° solid-state array laser high-speed scanning detection system is proposed, and fully digital echo signal processing is realized through high-speed AD sampling. According to the characteristics of the high-speed rendezvous between missile and target, a low-altitude sea background target recognition method based on intra-frame judgment and inter-frame accumulation is proposed. This method can quickly filter out sea background clutter by means of pass-through filtering, mathematical morphology filtering, and target morphology features, ensuring the real-time and reliability requirements of missile-borne detection and recognition.
Simulation verification under different intersection conditions shows that the average recognition accuracy of this method under different sea conditions is 96.9% (Tab.2).  Conclusions   In this paper, a target recognition method based on digital array laser scanning imaging is proposed. The method realizes circumferential 360° solid-state scanning detection through the time-shared, high-speed operation of an electronically controlled laser array, and digitizes the echo imaging features through high-speed AD sampling. It has the characteristics of fast recognition and a high degree of digitalization, and can meet the real-time requirements of high-speed target recognition. Digital modeling of the sea surface and of laser detection has been carried out, and the optical reflection characteristics of sea clutter have been simulated and analyzed. Based on the target characteristics, a low-altitude sea background target recognition method using intra-frame judgment and inter-frame accumulation has been designed; simulation and test verification give an average recognition accuracy of 96.9% under different sea conditions (Tab.2). The relevant technologies can provide methods and ideas for anti-interference techniques of laser fuzes in the low-altitude sea environment.
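The intra-frame/inter-frame idea above can be sketched briefly in Python/OpenCV (an editorial illustration inferred from the abstract's description, not the authors' missile-borne code): morphological opening suppresses the spatially discrete sea-surface flicker within each frame, and accumulation across frames keeps only echoes that persist like a solid target. The kernel size and persistence ratio are assumed values.

```python
import cv2
import numpy as np

def detect_target(echo_frames, k_open=3, persist=0.6):
    """Intra-frame morphological filtering plus inter-frame
    accumulation. echo_frames: list of binary echo images
    (uint8, 0/255). Sea flicker is spatially discrete, so opening
    removes it within a frame; a solid target persists across
    frames, so only pixels present in >= persist of the frames
    survive the accumulation.
    """
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT,
                                       (k_open, k_open))
    acc = np.zeros(echo_frames[0].shape, dtype=np.float32)
    for f in echo_frames:
        # Intra-frame judgment: opening removes isolated flicker.
        opened = cv2.morphologyEx(f, cv2.MORPH_OPEN, kernel)
        acc += (opened > 0).astype(np.float32)
    # Inter-frame accumulation: keep persistent pixels only.
    return acc >= persist * len(echo_frames)   # boolean target mask
```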